Differentially Private Testing of Identity and Closeness of Discrete Distributions

Authors

  • Jayadev Acharya
  • Ziteng Sun
  • Huanyu Zhang
Abstract

We study the fundamental problems of identity testing (goodness of fit) and closeness testing (two-sample test) of distributions over k elements, under differential privacy. While these problems have a long history in statistics, finite-sample bounds for them have only been established recently. In this work, we derive upper and lower bounds on the sample complexity of both problems under (ε, δ)-differential privacy. We provide sample-optimal algorithms for the identity testing problem for all parameter ranges, and the first results for closeness testing. Our closeness testing bounds are optimal in the sparse regime, where the number of samples is at most k. Our upper bounds are obtained by privatizing non-private estimators for these problems; the non-private estimators are chosen to have small sensitivity. We propose a general framework to establish lower bounds on the sample complexity of statistical tasks under differential privacy. We show a bound on differentially private algorithms in terms of a coupling between the two hypothesis classes we aim to test. By constructing carefully chosen priors over the hypothesis classes and using Le Cam's two-point theorem, we provide a general mechanism for proving lower bounds. We believe that the framework can be used to obtain strong lower bounds for other statistical tasks under privacy.
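The idea of privatizing a low-sensitivity statistic can be illustrated with a minimal sketch. This is not the estimator from the paper: it assumes, purely for illustration, that the statistic is the empirical total variation distance to a known distribution q, that samples are integers in {0, ..., k-1}, and that privacy comes from the standard Laplace mechanism (which gives pure ε-differential privacy, and hence (ε, δ)-differential privacy for any δ ≥ 0). The names `empirical_tv`, `laplace_noise`, `private_identity_test`, and the `threshold` parameter are hypothetical.

```python
import random
from collections import Counter


def empirical_tv(samples, q):
    """Total variation distance between the empirical histogram and q.

    Assumes every sample is an integer index in range(len(q)).
    """
    n = len(samples)
    counts = Counter(samples)
    return 0.5 * sum(abs(counts.get(i, 0) / n - q_i) for i, q_i in enumerate(q))


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_identity_test(samples, q, eps, threshold, rng):
    """Return True (accept "p = q") if the noisy statistic is below threshold.

    Replacing one sample changes two histogram counts by 1 each, so the
    statistic moves by at most 1/n; Laplace noise of scale (1/n)/eps
    therefore makes the released value eps-differentially private.
    """
    n = len(samples)
    sensitivity = 1.0 / n
    noisy_stat = empirical_tv(samples, q) + laplace_noise(sensitivity / eps, rng)
    return noisy_stat < threshold


# Hypothetical usage: test samples against the uniform distribution on 4 elements.
rng = random.Random(0)
uniform = [0.25] * 4
balanced = [0, 1, 2, 3] * 25
print(private_identity_test(balanced, uniform, eps=1.0, threshold=0.5, rng=rng))
```

The smaller the sensitivity, the less noise is needed for a given ε, which is why the choice of a low-sensitivity statistic matters for sample complexity. The threshold here is an arbitrary illustrative value, not one derived from the paper's analysis.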


Similar resources

Differentially Private Identity and Closeness Testing of Discrete Distributions

We investigate the problems of identity and closeness testing over a discrete population from random samples. Our goal is to develop efficient testers while guaranteeing Differential Privacy to the individuals of the population. We describe an approach that yields sample-efficient differentially private testers for these problems. Our theoretical results show that there exist private identity a...


Optimal Algorithms for Testing Closeness of Discrete Distributions (Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, Society for Industrial and Applied Mathematics)

We study the question of closeness testing for two discrete distributions. More precisely, given samples from two distributions p and q over an n-element set, we wish to distinguish whether p = q versus p is at least ε-far from q, in either ℓ1 or ℓ2 distance. Batu et al. [BFR00, BFR13] gave the first sub-linear time algorithms for these problems, which matched the lower bounds of [Val11] up to a...


Near-Optimal Closeness Testing of Discrete Histogram Distributions

We investigate the problem of testing the equivalence between two discrete histograms. A k-histogram over [n] is a probability distribution that is piecewise constant over some set of k intervals over [n]. Histograms have been extensively studied in computer science and statistics. Given a set of samples from two k-histogram distributions p, q over [n], we want to distinguish (with high probabi...


On discrete a-unimodal and a-monotone distributions

Unimodality is one of the structural features of a distribution that, like skewness, kurtosis and symmetry, is visible in the shape of its function. Comparing two different distributions can be a very difficult task. But if both distributions are of the same type, for example both are unimodal, we may compare them simply by comparing their modes, dispersions and skewness. So, the concept of unimodali...


Classification and properties of acyclic discrete phase-type distributions based on geometric and shifted geometric distributions

Acyclic phase-type distributions form a versatile model, serving as approximations to many probability distributions in various circumstances. They exhibit special properties and characteristics that usually make their applications attractive. Compared to acyclic continuous phase-type (ACPH) distributions, acyclic discrete phase-type (ADPH) distributions and their subclasses (ADPH family) have ...


Journal:
  • CoRR

Volume: abs/1707.05128

Publication date: 2017